Version 1.0 (13 August 2017)

Overview

What we will cover

  1. Prerequisites and how to get started
    • R and RStudio
    • Create a new RMarkdown document
  2. (R)Markdown 101
  3. Customising R markdown elements and outputs
    • Changing the RMarkdown output
    • Include tables and plots
    • Layout/theming
  4. To interactivity and beyond!

6 interactive sessions.

Prerequisites and how to get started

Prerequisites 1

For this workshop

Presenters: Maurits Evers, Sebastian Kurscheid
Slides: https://github.com/mevers/workshop_RMarkdown

git clone https://github.com/mevers/workshop_RMarkdown

What you should bring / have brought

  1. A laptop with a recent RStudio release already installed
  2. Questions, questions, questions …
  3. (Optional) A sample text document for you to lay out in RMarkdown

What you should be familiar with

  1. Basic R skills: Know what a dataframe is; how to produce a plot; how to inspect an element, …
  2. Basic Unix/Linux skills

Prerequisites 2

Important R packages

source("https://gist.github.com/mevers/ac8eccfe45b0638a8b9a258664c91741/raw/install_workshop_libs.R");

Example data

We will use the starwars dataset from dplyr.

head(starwars, n = 3);
## # A tibble: 3 x 13
##             name height  mass hair_color  skin_color eye_color birth_year
##            <chr>  <int> <dbl>      <chr>       <chr>     <chr>      <dbl>
## 1 Luke Skywalker    172    77      blond        fair      blue         19
## 2          C-3PO    167    75       <NA>        gold    yellow        112
## 3          R2-D2     96    32       <NA> white, blue       red         33
## # ... with 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
## #   films <list>, vehicles <list>, starships <list>

How to get started 1

Create new RMarkdown (Rmd) document in RStudio

  1. File → New File → R Markdown

How to get started 2

  1. Specify output option

  2. Render

Interactive session 1

  1. If you need help with setting up RStudio and R packages, now is the time to let us know…

  2. Create a new R Markdown (Rmd) document in RStudio.
  3. Select PDF, Word, or HTML as output option.
  4. Render the RMarkdown document to produce the output document.
  5. Locate and open the final output file.
  6. Change the output format of an existing RMarkdown document within RStudio.
  7. Add some text to your RMarkdown file.
    You can either use your own text or download a text file and copy & paste its content into your RMarkdown document.

    # R code
    download.file(
      "https://gist.github.com/mevers/f4b149520f32cfda538a43c31428d7c5/raw/sample_text.txt",
      "sample_text.txt");

Under the hood of RStudio

Document conversion

  1. RStudio's Rmd → (PDF, HTML, Word) conversion uses rmarkdown::render()
    (previously knitr::pandoc(), now deprecated)
  2. rmarkdown::render() uses the universal document converter pandoc
    (automatically installed as part of RStudio)

For example, pandoc will translate

# Header

into the following output-specific lines:

HTML output | LaTeX output
------------ | -------------
<h1>Header</h1> | \section{Header}
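
The same conversion can also be started from the R console instead of the RStudio Knit button. The following is a minimal sketch, assuming the rmarkdown package and pandoc are installed; the file content and the title "Render test" are made up for illustration.

```r
# Write a tiny R Markdown file to a temporary location (illustrative content).
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Render test\"",
  "output: html_document",
  "---",
  "",
  "# Header",
  "",
  "Some text."
), rmd_file)

# Render only if rmarkdown and pandoc are available on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file, output_format = "html_document",
                                quiet = TRUE)
  print(out_file)  # path of the generated HTML file
}
```

Changing output_format to "pdf_document" or "word_document" should produce the other formats, provided the required tools (e.g. a LaTeX installation for PDF) are present.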

(R)Markdown 101

Beware the different Markdown flavours: GitHub, R Markdown, vanilla Markdown, …

Here: Focus on RMarkdown, and the following elements

  1. Headers
  2. Emphasis: bold and italics (note: no underline)
  3. Lists: Ordered and unordered
  4. Images and links
  5. Blockquotes
  6. Rules and line breaks

More details:

(R)Markdown 101 – continued

Headers

# Header 1
## Header 2
### Header 3
#### Header 4

Emphasis

*italic*, _italic_
**bold**, __bold__

Lists

1. Level 1
    + Item 1
    + Item 2
2. Level 2
    + Item 3
    + Item 4
3. Level 3

Images and links

![Alt text](URL)
[RStudio](https://www.rstudio.com/)
[email@email.com](mailto:email@email.com)

Blockquote

> RStudio makes R easier to use. It includes a 
  code editor, debugging & visualization tools.

Rules

----, ****

Manual line breaks
End a line with two or more spaces.

Line 1 ends here,␣␣
line 2 starts here.

(R)Markdown 101 – continued

Tables
Write the column headers in the first row, separate them from the table body with a row of hyphens -, and separate the columns with a pipe |.

First Header | Second Header
------------ | -------------
Cell 1 | Cell 2
Cell 3 | Cell 4

Fenced code blocks
Lines wrapped within an environment with leading and trailing ``` are converted into a code block.
An optional language identifier provides syntax highlighting.

```bash
echo "3 + 4"
```

R code blocks will be evaluated and their results printed.

```{r}
3 + 4
```

Inline code
Use single backticks ` with a leading r

`r summary(starwars)`
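
To see what knitr does with a code chunk and an inline expression without creating a file, the text can be passed directly to knitr::knit(). This is a minimal sketch, assuming the knitr package is installed; the example text is made up, and the code fence is assembled with strrep() only to avoid writing literal backtick fences inside this snippet.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  fence <- strrep("`", 3)  # three backticks
  rmd_text <- c(
    paste0(fence, "{r}"),      # start of an R code chunk
    "3 + 4",
    fence,                     # end of the chunk
    "",
    "The sum is `r 3 + 4`."    # inline R expression
  )
  # knit() evaluates the chunk and the inline code and returns Markdown text.
  md <- knitr::knit(text = rmd_text, quiet = TRUE)
  cat(md, sep = "\n")
}
```

The chunk is replaced by the code plus its printed result, while the inline expression is replaced by its value within the sentence.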

Equations
The power of LaTeX (MathJax)

$x = a$
$$ \int_{x=0}^\infty dx \, \log(1+x) $$

Interactive session 2

  1. Experiment with different markdown elements in your RMarkdown document
    • Introduce headers
    • Emphasise words (e.g. Star Wars characters in the text) by making them italic
    • Create a Wikipedia/Wookieepedia link to a planet's name.
  2. Evaluate R code and show results
    • Return Tukey's five number summary of the height of all starwars characters
    • Show a summary table of the gender distribution based on all starwars characters
    • Show those characters of the starwars dataframe that have no reported gender
  3. Render the document and check the resulting output file
  4. Are there any specific text elements you would like to typeset? E.g. underlining/strikethrough text?

(R)Markdown 101 – Output-specific layout

You can use markup that is specific to the output format (e.g. HTML or LaTeX) in RMarkdown documents.

For example:

  1. In HTML documents, you can use
    • <img src="path/to/image.png" style="width:200px;"> to place an image,
    • <hr> to place a horizontal rule,
    • <ol start="10"><li>...</ol> to create an ordered list that starts numbering at 10,
    • most HTML code tags (e.g. <u>, <del>, …).
  2. In PDF documents, you can use
    • \includegraphics[width=\textwidth]{...} to place an image,
    • \begin{minipage}{.5\textwidth}...\end{minipage} to fine-tune horizontal and vertical text/figure layout,
    • much of the LaTeX syntax.

Beware: A HTML statement won't be recognised if you render your RMarkdown file as a Word document.
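
One way to keep a document portable across output formats is to generate the format-specific markup from R. The sketch below uses a hypothetical helper, image_markup() (not part of any package), that returns either the HTML or the LaTeX statement from the list above.

```r
# Hypothetical helper: return image markup for the requested output format.
image_markup <- function(path, format = c("html", "latex")) {
  format <- match.arg(format)
  if (format == "html") {
    sprintf('<img src="%s" style="width:200px;">', path)
  } else {
    sprintf("\\includegraphics[width=\\textwidth]{%s}", path)
  }
}

cat(image_markup("path/to/image.png", "html"), "\n")
cat(image_markup("path/to/image.png", "latex"), "\n")

# Inside a chunk with the option results='asis', the format could be chosen
# at render time, e.g.
#   fmt <- if (knitr::is_html_output()) "html" else "latex"
#   cat(image_markup("path/to/image.png", fmt))
```

This covers HTML and PDF output only; a Word document would still need a plain Markdown image.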

RMarkdown elements

RMarkdown style advice

  1. Uniquely name your code chunks

     ```{r chunk_name, ...}
    • increases readability
    • helps with caching
  2. Whitespace is important:
    • Two spaces at the end of a line mark a manual line break
    • A list requires a preceding newline
      (this is not the case in e.g. GitHub-flavoured markdown)
  3. Don't call dev.off(): Graphics devices are handled internally by knitr!

More Markdown-generic details:
Markdown style guide

RMarkdown output

The YAML header

---
title: "Main title"
subtitle: "Subtitle"
author: "Firstname Lastname"
date: "07/08/2017"
output:
  word_document
---

Main keys

  1. Title and subtitle
  2. Author (the author key also accepts a list of several authors)
  3. Date (can be R code)
  4. Output option: html_document, pdf_document, word_document, ioslides_presentation, beamer_presentation
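
Because the date key can contain R code, any call that returns a string can be used there. A minimal sketch with base R's format(); the fixed date and the numeric day/month/year format are chosen only to make the result reproducible.

```r
# A fixed date is used here so the result is reproducible.
workshop_date <- as.Date("2017-08-13")
formatted <- format(workshop_date, "%d/%m/%Y")
print(formatted)  # prints [1] "13/08/2017"

# In the YAML header the same kind of call is embedded as inline R code, e.g.
#   date: "`r format(Sys.Date(), '%d/%m/%Y')`"
```

The available conversion specifications (such as %d, %m and %Y) are listed in ?strptime.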

Interactive session 3

  1. Make changes to the existing YAML header keys title and author.
  2. Add an email address link to the author's name.
  3. Change the date to automatically give the current "DayOfTheWeek, Day MonthName YearWithCentury".
  4. Add a subtitle to your document.
  5. Add a second author.
  6. Change the output from a Word document to an ioslides presentation by changing the YAML header.

Hint:

# R code
?strptime
# format(Sys.time(), ...)

Controlling the R output: Code chunk options

Some useful options

  • eval=FALSE: Don't run code.
  • include=FALSE: Run code but don't include the chunk in the output document.
  • echo=FALSE: Don't show code, show results.
  • results='hide': Show code, don't show results.
  • message=FALSE: Don't show any additional R messages.
  • error=FALSE: Stop rendering when R code throws an error (use error=TRUE to show the error message in the output and continue).
  • warning=FALSE: Don't show R warning messages.
  • cache=TRUE: Use cached results (if available) until the code chunk is changed.
  • fig.width=7, fig.height=7: Set actual figure width and height to 7 inches.
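  • out.width='5in', out.height='5in': Set the figure width and height in the output document (character values with units, e.g. '5in' for PDF, '300px' for HTML, or a percentage such as '50%'); if they differ from fig.width, fig.height, the figure is scaled.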

More chunk options can be found e.g. in the R Markdown Reference Guide and in knitr's documentation.
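
Options that should apply to all chunks do not have to be repeated in every chunk header; they can be set once, typically in a setup chunk at the top of the document. A minimal sketch, assuming the knitr package is installed; the particular option values are only an example.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Defaults for all following chunks; single chunks can still override them.
  knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE,
                        fig.width = 7, fig.height = 7)
  # Query the current default of one option.
  print(knitr::opts_chunk$get("echo"))
}
```

A chunk that should show its code despite these defaults would then set echo=TRUE in its own header.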

Interactive session 4

  1. Returning to your starwars gender summary table, output only the table without showing the R code.
  2. Write R code to store the starwars dataframe in an external CSV file. Show neither code nor any output.
  3. Show the code for generating a table of how many characters appear in each Star Wars film; then
  4. Generate a barplot of how many characters appear in each Star Wars film.
  5. Change the dimension of the (gg)plot.

Hint:

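# R code
library(dplyr);    # provides the starwars dataset
library(ggplot2);  # provides ggplot(), aes(), geom_bar()
# Extract all films per character
df <- cbind.data.frame(film = unlist(starwars$films));
ggplot(data = df, aes(film)) + geom_bar();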

Equations

Use standard LaTeX syntax.

  1. Inline equations: $x=a$ produces \(x=a\).
  2. Display equations: $$ x = a $$ produces \[x=a\,.\]

Complex multi-line equation example:

 $$
 \begin{aligned}
 \frac{dq_i}{dt} &= \frac{\partial H}{\partial p_i} \\
 \frac{dp_i}{dt} &= -\frac{\partial H}{\partial q_i}
 \end{aligned}
 $$

\[ \begin{aligned} \frac{dq_i}{dt} &= \frac{\partial H}{\partial p_i} \\ \frac{dp_i}{dt} &= -\frac{\partial H}{\partial q_i} \end{aligned} \]

Tables

Different ways to present tabular data.

  1. Manual table creation

    First Header | Second Header
    ------------ | -------------
    Cell 1 | Cell 2
    Cell 3 | Cell 4
  2. knitr::kable() (all output)
  3. pander::pandoc.table() (all output)
  4. xtable::xtable() (only HTML and LaTeX output)
  5. stargazer::stargazer() (only HTML and LaTeX output)
  6. DT::datatable() (only HTML output, interactive)

Tables: Using knitr::kable()

# Load R package knitr and show table
suppressMessages(library(knitr));
kable(starwars[1:6, 1:6])

name | height | mass | hair_color | skin_color | eye_color
---- | ------ | ---- | ---------- | ---------- | ---------
Luke Skywalker | 172 | 77 | blond | fair | blue
C-3PO | 167 | 75 | NA | gold | yellow
R2-D2 | 96 | 32 | NA | white, blue | red
Darth Vader | 202 | 136 | none | white | yellow
Leia Organa | 150 | 49 | brown | light | brown
Owen Lars | 178 | 120 | brown, grey | light | blue
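
kable() also accepts arguments to tidy up the presentation, such as col.names and align. The sketch below assumes the knitr package is installed and rebuilds three rows of the table above by hand so that it runs without dplyr; the column labels are chosen for illustration (starwars reports height in cm and mass in kg).

```r
# Three rows copied from the starwars table shown above.
chars <- data.frame(
  name   = c("Luke Skywalker", "C-3PO", "R2-D2"),
  height = c(172, 167, 96),
  mass   = c(77, 75, 32)
)

if (requireNamespace("knitr", quietly = TRUE)) {
  tab <- knitr::kable(chars, format = "markdown",
                      col.names = c("Name", "Height (cm)", "Mass (kg)"),
                      align = c("l", "r", "r"))
  print(tab)  # a pipe table with left- and right-aligned columns
}
```

Inside an R Markdown chunk the format argument can usually be omitted, since knitr picks a table format that suits the output document.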

Tables: Using DT::datatable()

# Load R package DT and show table
suppressMessages(library(DT));
DT::datatable(starwars[, 1:10], options = list(pageLength = 6, scrollX = TRUE));

Interactive session 5

  1. Manually construct a table with the Star Wars movie titles in alphabetical order in column 1, and your personal review score as a number between 1 (lowest) and 10 (highest) in column 2.
  2. Write R code to show the first six columns of starwars in a Word document using a suitable R method. Convert the height from cm to m, and show 2 decimal places.
  3. Write R code to show the first 10 columns of starwars in a HTML document using a suitable R method.

Note: Depending on the R method, you might have to install additional R packages, e.g. for stargazer:

# R code
# Install (if necessary) and load R package 'stargazer'
if (!require("stargazer", quietly = TRUE)) {
  install.packages("stargazer", repos = "http://cran.rstudio.com/");
  require("stargazer", quietly = TRUE);
}

Static and interactive plots and figures

Static plot: ggplot2::ggplot()

# Split multi-valued skin colours (e.g. "white, blue") into separate entries
df <- cbind.data.frame(skin_color = unlist(strsplit(starwars$skin_color, ", ")));
ggplot(df, aes(skin_color)) + geom_bar() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5));

Interactive plots: plotly::ggplotly()

p <- ggplot(starwars, aes(x = height, y = mass, label = name)) + geom_point() +
  stat_smooth(fullrange = TRUE, method = "lm", na.rm = TRUE) + theme_bw();
ggplotly(p, height = 460, width = 960);

Interactive plots: scatterD3::scatterD3()

scatterD3(x = starwars$height, y = starwars$mass, lab = starwars$name);

Interactive session 6

  1. Show the height and log10-transformed age distributions of the starwars characters as density plots.
  2. Show an interactive barplot distribution of eye colors of the starwars characters using the R package plotly.
  3. Based on the starwars characters, assess results from a linear model \[ \text{height} = \beta_0 + \beta_1 \times \log_{10}\text{mass}\,. \]
    • Show fit results as a table.
    • Show the regression curve.
    • Is mass a statistically significant predictor?

Beyond interactive plots

Example 1: RMarkdown and shiny integration

Example 2: Advanced plots

  • geo-spatial plots
  • parallel-coordinates plots

Example 3: Bookdown

Thanks